
FE-847: DX introspection tier 2 #202

Merged
lunelson merged 1 commit into next from ln/fe-847-dx-introspection-tier-2
Jun 11, 2026

@lunelson lunelson commented Jun 11, 2026

Stack Context

This PR starts the FE-847 Tier-2 DX introspection work above the elicitation-gaps stack.

What?

  • Locks the turn-boundary choreography in SPEC/PLAN and cards.
  • Adds the Tier-2 faux harness and Pi introspection seams.
  • Threads assistant-visible watermarks, continuity classification, mention ledger, and next-turn preparation through session tests.

lunelson commented Jun 11, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@lunelson lunelson marked this pull request as ready for review June 11, 2026 10:13
Copilot AI review requested due to automatic review settings June 11, 2026 10:13

cursor Bot commented Jun 11, 2026

PR Summary

Medium Risk
Touches live Pi session boundaries and provider-request guards; incorrect continuity ordering could affect every model turn, though behavior is covered by new extension/graph tests.

Overview
Documents and partially wires FE-847 turn-boundary choreography: SPEC/PLAN/HANDOFF lock D76–D78 and I45–I47, add closure cards for reconciliation vs kick/seeding, and keep a skipped Tier-2 scaffold in tier-2-harness.test.ts as the live coverage map.

Runtime changes: prepareNextTurn runs on the ordered session-boundary pipeline (session_start / before_agent_start / assistant message_start); before_provider_request only guards (fails if continuity drift remains). Graph tools append watermark carriers (brunch.own_mutation, brunch.graph_overview_snapshot). Dev introspection under BRUNCH_DEV mirrors the final system prompt and Brunch-owned text tool results into launch-cwd .brunch/debug/ (D69/I42). Prompting tests drop obsolete readinessGrade from fixtures.
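
The split described above (reconcile on the ordered boundary pipeline, guard only at the provider request) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `TurnState`, `BoundaryStep`, `createBoundaryPipeline`, `guardProviderRequest`, and the notice text are hypothetical, and this `prepareNextTurn` is a stand-in with an invented signature. Only the event names and the reconciler's name come from the summary.

```typescript
// Hypothetical sketch: only the event names and the name prepareNextTurn come from the PR.
type BoundaryEvent = 'session_start' | 'before_agent_start' | 'message_start';

interface TurnState {
  graphLsn: number; // latest graph sequence number
  visibleLsn: number; // sequence number the assistant was last shown
  notices: string[]; // continuity notices appended for the assistant
}

type BoundaryStep = (state: TurnState, event: BoundaryEvent) => void;

// Stand-in reconciler: append one notice and advance the visible watermark.
const prepareNextTurn: BoundaryStep = (state, event) => {
  if (state.visibleLsn < state.graphLsn) {
    state.notices.push(`${event}: worldUpdate ${state.visibleLsn} -> ${state.graphLsn}`);
    state.visibleLsn = state.graphLsn;
  }
};

// Steps run strictly in registration order on every boundary event.
function createBoundaryPipeline(steps: BoundaryStep[]): BoundaryStep {
  return (state, event) => {
    for (const step of steps) step(state, event);
  };
}

// The provider-request hook repairs nothing; it only refuses to send on drift.
function guardProviderRequest(state: TurnState): void {
  if (state.visibleLsn !== state.graphLsn) {
    throw new Error(`continuity drift: visible ${state.visibleLsn}, graph ${state.graphLsn}`);
  }
}
```

Keeping the guard free of repair logic means a failure there points at a boundary that was skipped, rather than hiding it behind a late fix-up.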

Reviewed by Cursor Bugbot for commit f100a96.

@lunelson lunelson changed the title from "spec/plan: lock Tier-2 turn-boundary choreography (D76-D78, I45-I47) pre-scope" to "FE-847: DX introspection tier 2" Jun 11, 2026
Comment thread src/.pi/brunch-pi-extensions.ts
Comment thread src/.pi/brunch-pi-extensions.ts

Copilot AI left a comment

Pull request overview

This PR advances FE-847 “Tier-2 DX introspection” and the turn-boundary choreography layer by introducing (a) assistant-visible watermark projection + continuity entry taxonomy, (b) a prepareNextTurn reconciler scaffold (worldUpdate + drains + mention staleness hooks), (c) mention-ledger capture at submit-time, (d) a Tier-2 real-boot faux harness and dev-only introspection debug cache mirroring, and (e) plumbing these seams into Pi extension lifecycle boundaries and RPC session methods.

Changes:

  • Add assistant-visible watermark + continuity-entry classifier, plus session-boundary origination helper (startAssistantTurn) and reconciler (prepareNextTurn).
  • Introduce mention ledger extraction/resolution utilities and start recording mentions on session.submitMessage.
  • Expand DX tooling: Tier-2 harness for real boot + faux turn + transcript inspection; introspection debug-cache mirroring of system prompt and select tool results; lifecycle pipeline wiring.
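
As a rough illustration of the assistant-visible watermark idea in the first bullet, the sketch below projects the latest `{specId, lsn}` from carrier entries and refuses cross-spec comparisons. The entry shape (`customType`, `watermark`), the function names, and the choice to throw on a spec mismatch are assumptions made for illustration. Only the two carrier names and the `{specId, lsn}` pair appear in this PR.

```typescript
interface Watermark {
  specId: string;
  lsn: number;
}

// Hypothetical transcript entry shape; the real entry type is not shown in the PR.
interface EntryLike {
  customType?: string;
  watermark?: Watermark;
}

// Carrier names as listed in the PR summary.
const WATERMARK_CARRIERS = new Set(['brunch.own_mutation', 'brunch.graph_overview_snapshot']);

// Assumption: comparing watermarks from different specs is an error, not an ordering.
function compareWatermarks(a: Watermark, b: Watermark): number {
  if (a.specId !== b.specId) {
    throw new Error(`cross-spec watermark comparison: ${a.specId} vs ${b.specId}`);
  }
  return a.lsn - b.lsn;
}

// Project the highest watermark the assistant has seen from carrier entries only.
function assistantVisibleWatermark(entries: EntryLike[]): Watermark | undefined {
  let latest: Watermark | undefined;
  for (const entry of entries) {
    if (!entry.customType || !WATERMARK_CARRIERS.has(entry.customType)) continue;
    if (!entry.watermark) continue;
    if (latest === undefined || compareWatermarks(entry.watermark, latest) > 0) {
      latest = entry.watermark;
    }
  }
  return latest;
}
```

Entries that are not carriers are skipped even if they happen to hold a watermark-shaped field, so the carrier set stays the single place that decides what the assistant has seen.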

Reviewed changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/session/start-assistant-turn.ts | New assistant-origination decision helper + context seed insertion logic. |
| src/session/start-assistant-turn.test.ts | Unit coverage for origination and tail-debt classification behavior. |
| src/session/prepare-next-turn.ts | New pre-turn reconciler computing worldUpdate, drains, and mention-staleness hints. |
| src/session/prepare-next-turn.test.ts | Unit coverage for prepareNextTurn, drains, mention staleness, and guard retry loop. |
| src/session/mention-ledger.ts | New mention parsing, submit-time resolution to stable ids, and staleness hint helpers. |
| src/session/mention-ledger.test.ts | Unit tests for handle extraction, resolution, and staleness emission. |
| src/session/README.md | Session-domain docs updated to describe turn-boundary choreography ownership. |
| src/projections/session/continuity-entry-classifier.ts | New shared taxonomy for watermark carriers vs continuity-only vs debt-bearing entries. |
| src/projections/session/assistant-visible-watermark.ts | New projection for assistant-visible {specId, lsn} watermark + safe comparisons. |
| src/projections/session/assistant-visible-watermark.test.ts | Unit tests locking carrier set and cross-spec failure behavior. |
| src/projections/README.md | Projection ledger updated for new watermark and classifier projections. |
| src/rpc/methods/session.ts | Thread origination seeding into session.triggerExchange and append mention ledger on submit. |
| src/dev/tier-2-harness.ts | Tier-2 real-boot harness via runBrunchTui, faux-provider turn, transcript capture, resume fixture helper. |
| src/dev/tier-2-harness.test.ts | Tier-2 harness tests + scaffold describe.skip coverage map for FE-847 invariants. |
| src/dev/README.md | Dev-loop docs updated with Tier-2 real-boot loop and proof ownership ledger. |
| src/dev/index.ts | Re-export Tier-2 harness helpers from dev front door. |
| src/dev/faux-harness.ts | Capture provider contexts for Tier-1 assertions; allow passing resourceLoader/settingsManager for real composed payloads. |
| src/dev/faux-harness.test.ts | New Tier-1 provider-context capture assertions + Brunch-composed payload capture proof. |
| src/app/brunch-tui.ts | Thread dev introspection options + debug-cache location into TUI boot when BRUNCH_DEV is enabled. |
| src/app/brunch-tui.test.ts | Adjust tests for introspection debugCache and new event capture expectations; remove now-moved boot seam test. |
| src/.pi/README.md | Document session-boundary pipeline ordering and graph watermark stamping. |
| src/.pi/extensions/session/lifecycle.ts | Introduce ordered session-boundary pipeline and wire it to session_start/before_agent_start/assistant message start. |
| src/.pi/extensions/session/lifecycle.test.ts | Unit tests for pipeline ordering and event registration. |
| src/.pi/extensions/introspection/README.md | Update docs to include tool_result mirroring + .brunch/debug cache behavior. |
| src/.pi/extensions/introspection/index.ts | Add debug-cache mirroring on before_provider_request and tool_result events. |
| src/.pi/extensions/introspection/debug-cache.ts | New .brunch/debug/ cache writer for system prompt and selected tool text results. |
| src/.pi/extensions/graph/index.ts | Stamp watermark carriers for own mutations + full graph-overview reads. |
| src/.pi/brunch-pi-extensions.ts | Wire prepareNextTurn into the session boundary pipeline and add a before_provider_request continuity guard. |
| src/.pi/tests/prompting.test.ts | Update promptContext shape in tests (readiness grade removal). |
| src/.pi/tests/introspection.test.ts | Add debug-cache mirroring tests and update event registration expectations. |
| src/.pi/tests/graph-tools.test.ts | Assert watermark-carrier entries are appended for mutate_graph and read_graph overview. |
| src/.pi/tests/extension-registry.test.ts | Assert boundary-prep wiring and before_provider_request guard-only behavior. |
| memory/SPEC.md | Lock D76–D78 and related invariants for continuity/origination choreography. |
| memory/PLAN.md | Update plan/frontier definitions and FE-847 slice map for Tier-2 chassis and closures. |
| memory/cards/turn-boundary-reconciliation--continuity-chain.md | New closure card for reconciliation/watermark/mention end-to-end proof and compaction watermark preservation. |
| memory/cards/kick-and-context-seeding--honest-origination.md | New closure card for origination + context seeding end-to-end proof. |
| HANDOFF.md | New volatile handoff capturing FE-847 sequencing and scaffold edge cases. |

Comment on lines +67 to +87
    if (message?.role === 'toolResult') {
      const toolName = typeof message.toolName === 'string' ? message.toolName : '';
      return toolName.startsWith('request_') && responseStatus(message) !== 'answered';
    }
    return false;
  }
  return false;
}

function responseStatus(message: Record<string, unknown>): string | undefined {
  const details = isRecord(message.details)
    ? message.details
    : isRecord(message.data)
      ? message.data
      : undefined;
  return typeof details?.status === 'string' ? details.status : undefined;
}

function messageRecord(entry: TranscriptEntryLike): Record<string, unknown> | undefined {
  return isRecord(entry.message) ? entry.message : undefined;
}
Comment on lines +210 to +221
function prepareNextTurnForGraph(
  graph: BrunchGraphDeps,
  sessionManager: SessionManager,
): PrepareNextTurnResult {
  const snapshot = graph.reads.queryGraph(undefined, { visibility: 'all' });
  return prepareNextTurn({
    specId: graph.specId,
    currentLsn: snapshot.lsn,
    entries: sessionManager.getEntries(),
    changes: graphChangesFromSnapshot(graph.specId, snapshot),
  });
}
Comment thread src/.pi/brunch-pi-extensions.ts
@lunelson lunelson force-pushed the ln/fe-844-elicitation-gaps-ii branch from c2ddcdb to f8a3245 June 11, 2026 14:36
@lunelson lunelson force-pushed the ln/fe-847-dx-introspection-tier-2 branch from 50e5001 to 8e0d89d June 11, 2026 14:36

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Reviewed by Cursor Bugbot for commit 8e0d89d.

Comment thread src/.pi/brunch-pi-extensions.ts
Comment thread src/.pi/brunch-pi-extensions.ts
Comment thread src/.pi/extensions/graph/index.ts
Base automatically changed from ln/fe-844-elicitation-gaps-ii to next June 11, 2026 14:38
…pre-scope

Final oracle pre-scope review folded in:
- D78-L/I46-L: resume-debt ignore set now covers reconciler-inserted
  side-task & reviewer drains (D15-L), generalized to any notice owing
  no assistant continuation
- S0 scaffold: shared continuity-entry classifier stub
  (isWatermarkCarrier / isContinuityOnlyNonDebtEntry) so S1/S2 and S4
  share one taxonomy; assert worldUpdate/watermark/kick as sets and
  {specId,lsn} properties, not payload-order goldens

PLAN: all S0-S5 build on single FE-847 branch; coverage-first scaffold.
HANDOFF: records final oracle pass; edge-case ledger now 8.
Amp-Thread-ID: https://ampcode.com/threads/T-019eb232-6e53-74a2-9f95-fed451e47fa6
Co-authored-by: Amp <amp@ampcode.com>
@lunelson lunelson force-pushed the ln/fe-847-dx-introspection-tier-2 branch from 8e0d89d to f100a96 June 11, 2026 14:38
@lunelson lunelson merged commit 0f69577 into next Jun 11, 2026
5 of 6 checks passed
@lunelson lunelson deleted the ln/fe-847-dx-introspection-tier-2 branch June 11, 2026 14:39
lunelson added a commit that referenced this pull request Jun 11, 2026
* Sync planning docs after FE-847 restack

Amp-Thread-ID: https://ampcode.com/threads/T-019eb2e2-5c62-7388-8691-f8e04d4b6e50
Co-authored-by: Amp <amp@ampcode.com>

* fable ln-induct review and re-scope

* Flip I45 continuity guard live

* Thread mention continuity through live submit path

* Preserve watermark carriers across compaction

* Seed and kick new sessions on real boot

* Classify resume origination debt

* Require live elicitation gap readers

* Harden elicitation gap predicates

* Sweep localized review fixes

* Handle absent prompt gaps safely

* Restore PLAN honesty for FE-847 residual closure

The kick-and-context-seeding frontier was marked done while its four I46
resume-origination scaffold rows and two I47 idempotence rows remain it.todo
in the Tier-2 suite. Revert it to active with an honest pointer, note the
I47 residue on turn-boundary-reconciliation, and file the remediation
sequence as memory/REFACTOR.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Make runtime-state append contract honest

No caller consumes the appended entry id, and the extension-API write
channel (pi.appendEntry) cannot supply one. Change the session-manager
seam and appendBrunchAgentRuntimeSwitch to void, make
appendBrunchAgentRuntimeInit return an appended/skipped boolean (the
only meaningful sentinel it carried), and delete the hardcoded
placeholder id in the commands adapter.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Extract exhaustive gap-predicate semantics owner

gapPredicateSupport in the union's owning schema module classifies every
arm (structural / manual / unsupported) behind a never check; boundary
validation and coverage derivation both ride it. Adding a GapPredicate
arm without deciding its semantics is now a compile error, and a
structural arm without a derivation fails loud at read instead of
silently deriving 0. Behavior-preserving.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
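
The exhaustiveness pattern this commit describes can be sketched as below. The `GapPredicate` arms (`presence`, `manual`, `custom`) and the `deriveCoverage` helper are hypothetical stand-ins, since the real union is not shown here. Only the name `gapPredicateSupport` and the three support classes come from the commit message.

```typescript
// Hypothetical union arms; the real GapPredicate union is not shown in this PR page.
type GapPredicate =
  | { kind: 'presence'; nodeKind: string }
  | { kind: 'manual'; note: string }
  | { kind: 'custom'; expr: string };

type GapPredicateSupport = 'structural' | 'manual' | 'unsupported';

// Every arm must be classified; a new arm makes the `never` assignment a compile error.
function gapPredicateSupport(predicate: GapPredicate): GapPredicateSupport {
  switch (predicate.kind) {
    case 'presence':
      return 'structural';
    case 'manual':
      return 'manual';
    case 'custom':
      return 'unsupported';
    default: {
      const unreachable: never = predicate;
      throw new Error(`unclassified gap predicate: ${JSON.stringify(unreachable)}`);
    }
  }
}

// Coverage derivation rides the same classifier and fails loud instead of deriving 0.
function deriveCoverage(
  predicate: GapPredicate,
  nodeCounts: Record<string, number>,
): number | undefined {
  if (gapPredicateSupport(predicate) !== 'structural') return undefined; // not derived
  if (predicate.kind === 'presence') return nodeCounts[predicate.nodeKind] ?? 0;
  throw new Error(`structural predicate without a derivation: ${predicate.kind}`);
}
```

Because validation and derivation both consult one classifier, adding an arm forces a decision in a single place before either caller compiles.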

* Finish scoped offline env contract: set skip-version-check, drop dead dev flag

applyBrunchOfflineDefault now sets PI_SKIP_VERSION_CHECK alongside
PI_OFFLINE (the save/restore scaffolding's intent — offline launches
emit no version-check noise), never overriding user-provided values.
The dev flag on runWithScopedBrunchOfflineDefault was accepted but never
read; removed. Env tests assert set-during-run and restore-after for
both variables.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
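
A minimal sketch of the set-if-absent and restore-after contract described in this commit. The function names, the injected `env` object, and the value `'1'` are assumptions for illustration. The commit itself names only the two variables and the rule that user-provided values are never overridden.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: the value '1' is illustrative; the commit does not state the actual values.
const OFFLINE_DEFAULTS: Record<string, string> = {
  PI_OFFLINE: '1',
  PI_SKIP_VERSION_CHECK: '1',
};

// Set each default only when the user has not provided a value; return a restore function.
function applyOfflineDefault(env: Env): () => void {
  const touched: string[] = [];
  for (const [key, value] of Object.entries(OFFLINE_DEFAULTS)) {
    if (env[key] === undefined) {
      env[key] = value;
      touched.push(key);
    }
  }
  return () => {
    for (const key of touched) delete env[key];
  };
}

// Scope the defaults to a single run and restore the environment afterwards, even on throw.
function runWithScopedOfflineDefault<T>(env: Env, run: () => T): T {
  const restore = applyOfflineDefault(env);
  try {
    return run();
  } finally {
    restore();
  }
}
```

Passing the environment in as an object, rather than reading a global, keeps the set-during-run and restore-after assertions easy to write in tests.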

* Refresh chrome footer after runtime posture switches

The footer already re-projects strategy/lens from the transcript at
render time; nothing requested a render after /brunch:strategy or
/brunch:lens, so the footer kept showing launch-time values. Wire a
chrome-refresh handle at the composition root: chrome binds its
footer render-request into it, and a successful runtime switch calls
it (not on rejection or picker cancel).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
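
One way to picture the chrome-refresh handle is a small late-bound callback holder created at the composition root. Everything here (`RefreshHandle`, `createRefreshHandle`, `SwitchResult`, `afterRuntimeSwitch`) is a hypothetical shape. The commit states only that chrome binds its footer render-request into the handle and that a successful switch calls it.

```typescript
interface RefreshHandle {
  bind(render: () => void): void; // chrome registers its footer render-request
  request(): void; // callers ask for a re-render; a no-op until bound
}

function createRefreshHandle(): RefreshHandle {
  let render: (() => void) | undefined;
  return {
    bind(fn) {
      render = fn;
    },
    request() {
      render?.();
    },
  };
}

// Hypothetical switch outcomes; only a successful switch should refresh the footer.
type SwitchResult = 'applied' | 'rejected' | 'cancelled';

function afterRuntimeSwitch(result: SwitchResult, chrome: RefreshHandle): void {
  if (result === 'applied') chrome.request();
}
```

The handle lets the command path request a render without importing chrome, so the dependency still points from the composition root outward.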

* Echo projected mode in /brunch:mode no-op message

The already-current branch hardcoded 'elicit' instead of echoing the
projected operational mode; behavior-identical today, honest when the
mode vocabulary grows.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Require graph reads on the prompt context; fail loud on empty gap register

Reverts the 'Handle absent prompt gaps safely' patch (bbc4b4e6) and
removes the ?? [] fallback it was shielding. graphReads is now a
required, documented must-wire member of BrunchPromptContext — a
composition root that omits it is a type error — while session/context
are documented intended-optional. An empty gap register reaching
legality derivation now surfaces through the existing
missing-register-kind throw (the contract isCapabilityLegalForGaps
already documents) instead of quietly returning empty manifests and
axis options: every spec is seeded with floor gaps, so empty means
wiring bug, not posture.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Pin live gap legality through the Tier-2 real-boot oracle

The missing card acceptance from the live-gap-legality fix: a real
runBrunchTui boot over a fresh seeded spec derives turn-boundary tool
legality from that spec's actual gap coverage — uncovered floor gaps
keep capability-gated tools (mutate_graph) locked, a foreign writer
covering the grounding floor unlocks them on the next boundary, and
elicit mode never advertises bash either way.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Derive post-switch tool posture from real selected-spec gaps

applyRuntimeSwitch recomputed active tools with a hardcoded empty gap
register, silently floor-locking capability-gated tools until the next
turn boundary corrected it — the same optional-wiring fault family this
remediation targets. The commands seam now requires a gap reader; the
composition root derives it from the graph deps (selected-spec reads)
or, with no graph in the composition, the explicitly named
conservativeUncoveredFloorGaps fail-closed posture.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Flip the I46 resume-origination scaffold rows live through real boot

Adds bootTier2RuntimeFromFixture (the resume-side real-boot chassis) and
replaces the four I46 it.todo rows with live proofs: a user tail earns
the kick behind reconciler-inserted continuity notices — including after
earlier completed exchanges; request_* leaves stay idle for all three
terminal envelopes plus assistant/system leaves; crash-after-notice
reboot still kicks unresolved debt without duplicating the seed; and
trailing side-task/reviewer drains neither manufacture nor mask debt.

Two product fixes the live rows forced:
- seedAndKickAssistantTurn no longer blanket-suppresses the kick when
  any past exchange result exists (which silently broke post-exchange
  resume kicks); origin now derives from projected transcript state
  (no conversational message entries = new session), with re-kick
  dedupe falling out of the debt classifier itself.
- latestTailOwesAssistant reads the real request_* result envelope:
  outcome is answered/cancelled/unavailable key presence (as
  projections/exchanges actually writes it), not a status string —
  settling the PR #202 responseStatus question: the bot was right,
  an answered request tail would have re-kicked on resume.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
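
The second fix (outcome read by key presence rather than a status string) might look roughly like this. The `TailMessage` shape, the placement of the envelope under `details`, and the rule that an unsettled request_* tail still owes a continuation are assumptions. The three outcome keys and the user, assistant, and request_* behaviours are taken from the commit message.

```typescript
// Terminal outcome keys as named in the commit message.
const TERMINAL_OUTCOME_KEYS = ['answered', 'cancelled', 'unavailable'] as const;

// Hypothetical tail shape; the real transcript entry type is not shown here.
interface TailMessage {
  role: 'user' | 'assistant' | 'toolResult';
  toolName?: string;
  details?: Record<string, unknown>;
}

// Outcome is signalled by which key is present, not by a `status` string.
function hasTerminalOutcome(details: Record<string, unknown> | undefined): boolean {
  return details !== undefined && TERMINAL_OUTCOME_KEYS.some((key) => key in details);
}

function tailOwesAssistant(tail: TailMessage): boolean {
  if (tail.role === 'user') return true; // a user tail earns the kick
  if (tail.role === 'toolResult' && (tail.toolName ?? '').startsWith('request_')) {
    // Assumption: only a request_* result with no terminal outcome still owes a turn.
    return !hasTerminalOutcome(tail.details);
  }
  return false; // assistant leaves and other tool results stay idle
}
```

Under the earlier status-string check, an envelope carrying an `answered` key but no `status` field would have been treated as unanswered, which is the re-kick the commit message describes.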

* Flip the I47 idempotence scaffold rows live through real restart

Adds rebootTier2Runtime (flushes Pi's deferred JSONL, then re-boots the
real runtime over the same session file) and replaces the remaining
it.todo rows: the dedicated no-redundant-worldUpdate-after-seed proof
runs through real boot + provider preflight; boot/resume dedupe is
proven across an actual restart (seed, kick, and worldUpdate all
non-duplicated, derived purely from transcript projection); and the
sets-and-{specId,lsn} suite convention is enforced mechanically by a
source scan banning golden matchers in this suite. The Tier-2 scaffold
has no skipped or todo rows left.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Reconcile PLAN and REFACTOR state after FE-847 remediation closure

Both FE-847 frontiers are now honestly done: every I46/I47 Tier-2
scaffold row runs live, with the resume-side and idempotence proofs
through real boot/restart. REFACTOR.md remains only as the carrier for
the suspended migration-0004 item handed to the stacked branch.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* File typing-collapse refactor plan for the exchanges editor seam

Replaces the completed review-fix remediation plan in REFACTOR.md with
the /expert-typescript-typing findings: one canonical editor envelope
schema (the probe-side fallback is drift), a projected outcome union,
and one grounding-gap fixture builder shared by production and tests.
Carries the suspended migration-0004 item forward.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Extract the canonical request_choices editor-envelope schema

The product editor envelope (schema name
brunch.structured_exchange.request_choices.editor) moves from a
hand-written interface + parser inside the request_choices tool to a
zod schema co-located with the request details schemas. The prefill
template now types against the schema input, the response type is
inferred, and parsing is schema safeParse. A round-trip test locks
prefill -> edited response -> parse -> projection. The new exchanges
README documents the two-envelope rationale (editor wire status vs
transcript outcome-key presence).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Extract the request outcome-union owner from the details schemas

RequestOutcomeKey is now projected from the request details union
branches (KeysOfUnion minus header/tool_meta), with the exported
REQUEST_OUTCOME_KEYS list drift-coupled to the schema in both
directions via a satisfies Record marker. All four request projection
input types consume it, the editor envelope statuses become an
Exclude<RequestOutcomeKey, 'unavailable'> projection, and the session
debt classifier derives its terminal-keys check from the
projections/exchanges re-export instead of restating literals.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
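
The type-level projection this commit describes can be approximated as follows. The `RequestDetails` branches, the marker object, and `isEditorStatus` are illustrative assumptions. `KeysOfUnion`, `RequestOutcomeKey`, `REQUEST_OUTCOME_KEYS`, the excluded `header`/`tool_meta` keys, and the `Exclude<RequestOutcomeKey, 'unavailable'>` projection are named in the commit. The `satisfies` operator needs TypeScript 4.9 or later.

```typescript
// Distribute keyof over each branch of a union.
type KeysOfUnion<T> = T extends unknown ? keyof T : never;

// Hypothetical details union; the real zod-backed schemas are not shown here.
type Common = { header: string; tool_meta: { tool: string } };
type RequestDetails =
  | (Common & { answered: { choice: string } })
  | (Common & { cancelled: { reason: string } })
  | (Common & { unavailable: { reason: string } });

// Outcome keys are whatever the branches carry beyond the shared fields.
type RequestOutcomeKey = Exclude<KeysOfUnion<RequestDetails>, 'header' | 'tool_meta'>;

// A missing key or an extra key in this literal is a compile error,
// so the runtime list cannot drift from the union in either direction.
const OUTCOME_KEY_MARKER = {
  answered: true,
  cancelled: true,
  unavailable: true,
} satisfies Record<RequestOutcomeKey, true>;

const REQUEST_OUTCOME_KEYS = Object.keys(OUTCOME_KEY_MARKER) as RequestOutcomeKey[];

// Editor statuses are the outcome keys an editor can actually produce.
type EditorStatus = Exclude<RequestOutcomeKey, 'unavailable'>;

function isEditorStatus(key: RequestOutcomeKey): key is EditorStatus {
  return key !== 'unavailable';
}
```

Consumers such as a debt classifier can then import the runtime list instead of restating the three literals.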

* Converge the RPC proof on the canonical envelope and delete the fallback

The structured-exchange RPC proof now drives the product
request_choices editor flow (requestChoicesViaEditor, extracted from
the tool and shared by both callers) instead of the divergent
probe-only envelope. The shared/editor-fallback.ts module — its
envelope, parser, hand-written types, and single-select arm — is
deleted along with its index re-exports and helper tests; multi-choice
coverage through the one schema replaces the single-select arm.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Extract the grounding-gap fixture builder into graph/schema

One builder module (src/graph/schema/elicitation-gap-fixtures.ts) now owns
the synthetic ElicitationGap shape: presenceGap for single gaps and
groundingFloorGaps for the context/thesis/goal/constraint floor with a
per-kind coverage knob. The runtime extension's fail-closed
conservativeUncoveredFloorGaps rides the builder (keeping its name, export,
and doc comment), and the eleven hand-cloned per-test-file gap literals are
deleted in favor of importing it. Production owns the shape; tests import
it — never the reverse.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Mark typing-collapse refactor done; suspended migration item remains

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Retire REFACTOR.md; carry the migration handoff note into PLAN

All refactor steps are done; the one suspended item (migration 0004
coherence, owned by the stacked successor branch) moves to PLAN's
Active section so the reintegration re-check survives the file.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* ln-sync: reconcile canonical docs after the FE-847 closure arc

SPEC: I45/I46/I47 invariant and verification-design rows flip from
planned/coverage-first-scaffold to covered with 2026-06-11 evidence;
D35-L reconciled to the shipped, test-locked startup-header behavior
(every non-cancel activation headers; resume/open-stay-quiet clause
superseded; expand affordance removed until an input path exists);
A27-L gains the predicate-hardening evidence (gapPredicateSupport
owner, loud field/coverage rejection, presence kind-floor dedup,
hydration consistency); new Acknowledged Blind Spots row for
live-vs-harness wiring divergence with its mitigations and revisit
trigger.

PLAN: 12 done frontier definitions archived to PLAN_HISTORY as dated
pointer bullets (835 -> 543 lines); completed Sequencing subsections
collapsed into a Recently Completed section; stale active-track
reference repaired.

GC: stale memory/cards/tooling--runtime-state-commands.md deleted
(pickers/overlays shipped; the card's non-scope claims were drift).

READMEs: src/dev Tier-2 harness ledger gains the resume/reboot chassis
entries and the scaffold-fully-live note.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Graduate two induct lenses into ln-review contract-integrity catalog

Per user approval: the optional-hook live-wiring divergence lens (four
findings this arc) and the dark-union-variant lens (the gap-predicate
family) join the stabilized lens library with their cues, repairs, and
graduation evidence.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>